Upsortable: Programming TopK Queries Over Data Streams

Authors

Julien Subercaze

Christophe Gravier

Syed Gillani

Abderrahmen Kammoun

Frédérique Laforest

Abstract

Top-k queries over data streams are a well-studied problem. Numerous systems exist that process continuous queries over sliding windows. In contrast, non-append-only streams call for ad hoc solutions, e.g. tailor-made solutions implemented in a mainstream programming language. Meanwhile, the Stream API and lambda expressions were added in Java 8, giving the language powerful operations for data stream processing. However, the Java Collections Framework does not provide data structures that safely and conveniently support sorted collections of evolving data. In this paper, we demonstrate Upsortable, an annotation-based approach that allows existing sorted collections from the standard Java API to be used for dynamic data management. Our approach relies on a combination of pre-compilation abstract syntax tree modifications and runtime analysis of bytecode. Upsortable offers the developer a safe and time-efficient solution for developing top-k queries on data streams while keeping full compatibility with standard Java.
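To make the difficulty with sorted collections of evolving data concrete, here is a minimal sketch of the manual pattern a developer typically needs with a standard `TreeSet`: remove the element, change its sort key, then re-insert it. This is not Upsortable's API. The class `TopKSketch`, the `Player` type, and the `update` and `topK` methods are hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical illustration; not part of Upsortable.
class TopKSketch {
    /** Hypothetical stream item whose sort key (score) changes over time. */
    static final class Player {
        final String name;
        int score;
        Player(String name, int score) { this.name = name; this.score = score; }
    }

    // Highest score first; the name breaks ties so distinct players never compare equal.
    private static final Comparator<Player> BY_SCORE_DESC =
            Comparator.comparingInt((Player p) -> p.score).reversed()
                      .thenComparing(p -> p.name);

    private final TreeSet<Player> ranking = new TreeSet<>(BY_SCORE_DESC);
    private final Map<String, Player> byName = new HashMap<>();

    /** Applies one stream update using the remove, mutate, re-insert pattern. */
    void update(String name, int newScore) {
        Player p = byName.get(name);
        if (p == null) {
            p = new Player(name, newScore);
            byName.put(name, p);
        } else {
            // Remove BEFORE changing the key: mutating an element while it sits
            // in the TreeSet leaves it at a stale position in the tree.
            ranking.remove(p);
            p.score = newScore;
        }
        ranking.add(p);
    }

    /** Returns the names of the k highest-scoring players, best first. */
    List<String> topK(int k) {
        List<String> out = new ArrayList<>();
        for (Player p : ranking) {
            if (out.size() == k) break;
            out.add(p.name);
        }
        return out;
    }

    public static void main(String[] args) {
        TopKSketch s = new TopKSketch();
        s.update("ann", 10);
        s.update("bob", 20);
        s.update("cat", 15);
        s.update("ann", 30); // ann's score evolves; she moves to the top
        System.out.println(s.topK(2)); // prints [ann, bob]
    }
}
```

Forgetting the `remove` call before a field update is easy to do and fails silently, which is the kind of hazard the abstract refers to when it says the Java Collections Framework does not safely support sorted collections of evolving data.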


Similar resources

Presenting a dynamic method for answering ad hoc continuous aggregate queries

Data streams are infinite, fast, time-stamped data elements that are received in bursts. Generally, these elements need to be processed online, in real time. Algorithms that process data streams and answer queries on them are therefore mostly one-pass. Executing such algorithms raises challenges such as memory limitation, scheduling, and accuracy of answers. They will be more ...


Reporting l most influential objects in uncertain databases based on probabilistic reverse top-k queries

Reverse top-k queries are proposed from the perspective of a product manufacturer and are essential for manufacturers to assess the potential market. However, the existing approaches for reverse top-k queries are all based on the assumption that the underlying data are exact (or certain). Due to the intrinsic differences between uncertain and certain data, these methods cannot be applied to pr...


Top-k Dominating Queries: a Survey

Top-k dominating queries combine the advantages of top-k queries and skyline queries, and eliminate their disadvantages. They return k objects with the highest domination score, which is defined as the number of dominated objects. As a top-k query, the user can bound the number of returned results through the parameter k, and like a skyline query a user-selected scoring function is not required...


CrowdK: Answering top-k queries with crowdsourcing

In recent years, crowdsourcing has emerged as a new computing paradigm for bridging the gap between human- and machine-based computation. As one of the core operations in data retrieval, we study top-k queries with crowdsourcing, namely crowd-enabled top-k queries. This problem is formulated with three key factors: latency, monetary cost, and quality of answers. We first aim to design a novel fr...


Handling ER-topk Query on Uncertain Streams

Data uncertainty is widespread in many applications. In this paper, we aim to handle top-k queries on uncertain data streams. Since the volume of a data stream is unbounded whereas memory is limited, it is critical to devise one-pass solutions that are both time- and space-efficient. In this paper, we use two structures to handle this issue. The DomGraph stores all tuples that are p...



Journal:

PVLDB

Volume: 10

Publication date: 2017
